# Bi-Level Knowledge Transfer for Multi-Task Multi-Agent Reinforcement Learning

This repository contains the implementation of our paper "Bi-Level Knowledge Transfer for Multi-Task Multi-Agent Reinforcement Learning".

## Installation instructions

### Install StarCraft II

```bash
bash install_sc2.sh

# Replace the path with your own StarCraft II folder, e.g. /mnt/123/3rdparty/StarCraftII
export SC2PATH=/mnt/123/3rdparty/StarCraftII
```
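Before moving on, it can help to confirm that `SC2PATH` actually points at a directory. The snippet below is a minimal sketch; `check_sc2path` is a hypothetical helper introduced here for illustration and is not part of this repository.

```bash
# Hypothetical helper (not shipped with this repo): verify that SC2PATH
# is set and points at an existing directory.
check_sc2path() {
    if [ -z "${SC2PATH:-}" ]; then
        echo "SC2PATH is not set" >&2
        return 1
    fi
    if [ ! -d "$SC2PATH" ]; then
        echo "SC2PATH does not point to a directory: $SC2PATH" >&2
        return 1
    fi
    echo "SC2PATH OK: $SC2PATH"
}
```

Calling `check_sc2path` right after the `export` line prints a confirmation, or returns a non-zero status with a message on stderr if the variable is missing or wrong.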

### Add additional SMAC maps
```bash
git clone https://github.com/oxwhirl/smac.git
pip install -e smac/
bash install_smac_patch.sh
```

### Dataset
We use the shared dataset collected by ODIS, which was generated with QMIX policies at four different quality levels. Because of restrictions on including URLs, this repository ships only toy trajectories for `3m`. The full dataset is available from the ODIS repository, and we will release it here once our paper is accepted. To run our experiments, place the full dataset in the `dataset` folder.
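A quick sanity check that the `dataset` folder exists and is not empty can save a failed training launch. The following is only a sketch: `check_dataset_dir` is a hypothetical helper, and it does not validate the internal layout of the dataset, which this README does not describe.

```bash
# Hypothetical helper (not shipped with this repo): check that the dataset
# folder exists and contains at least one entry. The folder name defaults
# to "dataset"; pass another path as the first argument to override it.
check_dataset_dir() {
    dir="${1:-dataset}"
    if [ ! -d "$dir" ]; then
        echo "missing dataset folder: $dir" >&2
        return 1
    fi
    if [ -z "$(ls -A "$dir")" ]; then
        echo "dataset folder is empty: $dir" >&2
        return 1
    fi
    echo "dataset folder found: $dir"
}
```

Run `check_dataset_dir` from the repository root after copying the data into place.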


### Install Python environment

Install Python environment with conda:

```bash
conda create -n bikt python=3.10 -y
conda activate bikt
pip install -r requirements.txt
```

### Run experiments
Training proceeds in three stages: individual skill learning, team tactic learning, and decision-making learning.
The stages must be run in this order.

* 1. Individual Skill Learning
```bash
python src/main.py --bikt --config=bikt --env-config=sc2_offline --task-config=marine-hard-medium-expert-bikt --pretrain_vae=True --pretrain=True
```


* 2. Team Tactic Learning
```bash
python src/main.py --bikt --config=bikt --env-config=sc2_offline --task-config=marine-hard-medium-expert-bikt --pretrain_vqvae=True
```


* 3. Decision-Making Learning
```bash
python src/main.py --bikt --config=bikt --env-config=sc2_offline --task-config=marine-hard-expert-bikt --train_DT_w_glsk=True --skill_softmax=True 
```
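Since the three stages run one after another, they can be chained so that a failure in one stage stops the rest. The sketch below reuses the exact commands listed above; `run_stage`, `run_all_stages`, and the `DRY_RUN` variable are hypothetical names introduced here, not part of the repository. It assumes each stage finds the previous stage's outputs through the existing configs.

```bash
# Flags shared by all three stages (copied from the commands above).
COMMON="--bikt --config=bikt --env-config=sc2_offline"

# Hypothetical helper: print a stage label, then run (or just print) the command.
# $1 is the label; the remaining arguments are the stage-specific flags.
run_stage() {
    name="$1"
    shift
    echo "=== ${name} ==="
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Dry run: show the command without executing it.
        echo "python src/main.py ${COMMON} $*"
    else
        # $COMMON is intentionally unquoted so it splits into separate flags.
        python src/main.py $COMMON "$@"
    fi
}

# Run the stages in order; '&&' stops the chain at the first failing stage.
run_all_stages() {
    run_stage "Individual Skill Learning" \
        --task-config=marine-hard-medium-expert-bikt --pretrain_vae=True --pretrain=True &&
    run_stage "Team Tactic Learning" \
        --task-config=marine-hard-medium-expert-bikt --pretrain_vqvae=True &&
    run_stage "Decision-Making Learning" \
        --task-config=marine-hard-expert-bikt --train_DT_w_glsk=True --skill_softmax=True
}
```

Setting `DRY_RUN=1` before calling `run_all_stages` prints the three commands for review; calling it without `DRY_RUN` launches the actual training from the repository root.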




